Validation of average error rate over classifiers

Author

  • Eric Bax
Abstract

We examine methods to estimate the average and variance of test error rates over a set of classifiers. We begin with the process of drawing a classifier at random for each example. Given validation data, the average test error rate can be estimated as if validating a single classifier. Given the test example inputs, the variance can be computed exactly. Next, we consider the process of drawing a classifier at random and using it on all examples. Once again, the expected test error rate can be validated as if validating a single classifier. However, the variance must be estimated by validating all classifiers, which yields loose or uncertain bounds.
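The first process in the abstract (a fresh classifier drawn for each example) can be illustrated with a minimal sketch. It is not taken from the paper and rests on assumptions the abstract does not state: classifiers are binary, each draw is uniform over the set and independent across examples, and a one-sided Hoeffding bound is used as the single-classifier validation bound. The function names `average_error_bound` and `exact_variance` are hypothetical.

```python
import math


def average_error_bound(errors, delta):
    """Validate the randomized classifier as if it were a single classifier.

    errors[j][i] is 1 if classifier j errs on validation example i, else 0.
    Returns (validation error, upper bound on expected test error) using a
    one-sided Hoeffding bound with failure probability delta (an assumption;
    the abstract does not name the bound).
    """
    k = len(errors)          # number of classifiers
    m = len(errors[0])       # number of validation examples
    # Probability that a uniformly drawn classifier errs on example i.
    per_example = [sum(errors[j][i] for j in range(k)) / k for i in range(m)]
    mean_error = sum(per_example) / m
    epsilon = math.sqrt(math.log(1.0 / delta) / (2.0 * m))
    return mean_error, mean_error + epsilon


def exact_variance(predictions):
    """Variance of the test error rate over the classifier draws.

    predictions[j][i] is the 0/1 output of classifier j on test input i.
    For binary outputs, the drawn classifier errs on example i with
    probability p_i or 1 - p_i, where p_i is the fraction of classifiers
    that output 1. Either way the error indicator has variance
    p_i * (1 - p_i), so no labels are needed.
    """
    k = len(predictions)
    n = len(predictions[0])
    total = 0.0
    for i in range(n):
        p = sum(predictions[j][i] for j in range(k)) / k
        total += p * (1.0 - p)
    # Draws are independent across examples, so the variances add.
    return total / n ** 2
```

For the second process (one classifier drawn once and used on all examples), the label-free shortcut in `exact_variance` no longer applies, which matches the abstract's remark that the variance must then be estimated by validating all classifiers.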


Similar resources

Ensemble Validation: Selectivity has a Price, but Variety is Free

If classifiers are selected from a hypothesis class to form an ensemble, bounds on average error rate over the selected classifiers include a component for selectivity, which grows as the fraction of hypothesis classifiers selected for the ensemble shrinks, and a component for variety, which grows with the size of the hypothesis class or in-sample data set. We show that the component for select...


Application of ensemble learning techniques to model the atmospheric concentration of SO2

In view of pollution prediction modeling, the study adopts homogenous (random forest, bagging, and additive regression) and heterogeneous (voting) ensemble classifiers to predict the atmospheric concentration of Sulphur dioxide. For model validation, results were compared against widely known single base classifiers such as support vector machine, multilayer perceptron, linear regression and re...


arXiv:1610.01234v1 [stat.ML] 4 Oct 2016. Ensemble Validation: Selectivity has a Price, but Variety is Free

If classifiers are selected from a hypothesis class to form an ensemble, bounds on average error rate over the selected classifiers include a component for selectivity, which grows as the fraction of hypothesis classifiers selected for the ensemble shrinks, and a component for variety, which grows with the size of the hypothesis class or in-sample data set. We show that the component for select...


Optimal classifier selection and negative bias in error rate estimation: an empirical study on high-dimensional prediction

BACKGROUND In biometric practice, researchers often apply a large number of different methods in a "trial-and-error" strategy to get as much as possible out of their data and, due to publication pressure or pressure from the consulting customer, present only the most favorable results. This strategy may induce a substantial optimistic bias in prediction error estimation, which is quantitatively...


SVM-PSO based Feature Selection for Improving Medical Diagnosis Reliability using Machine Learning Ensembles

Improving accuracy of supervised classification algorithms in biomedical applications, especially CADx, is one of active area of research. This paper proposes construction of rotation forest (RF) ensemble using 20 learners over two clinical datasets namely lymphography and backache. We propose a new feature selection strategy based on support vector machines optimized by particle swarm optimiza...



Journal:
  • Pattern Recognition Letters

Volume 19

Publication date: 1998